Abstract 

We give an account of some results, both old and new, about any 
n x n Markov matrix that is embeddable in a one-parameter Markov 
semigroup. These include the fact that its eigenvalues must lie in a 
certain region in the unit ball. We prove that a well-known procedure 
for approximating a non-embeddable Markov matrix by an embeddable 
one is optimal in a certain sense. 
1 Introduction 

A Markov matrix A is defined to be a real n x n matrix with non-negative 
entries satisfying $\sum_{j=1}^{n} A_{ij} = 1$ for all i. The spectral properties of non-negative 
matrices and linear operators and in particular of Markov matrices have been 
studied in great detail, because of their great importance in finance, population 
dynamics, medical statistics, sociology and many other areas of probability 
and statistics. Theoretical accounts of parts of the subject may be found in 
[H El Uni |17]. This paper develops ideas of [10], which investigated when the 
pth roots of Markov matrices were also Markov; this problem is related to 
the possibility of passing from statistics gathered at certain time intervals, for 
example every year, to the corresponding data for shorter time intervals. 

Given an empirical Markov matrix, three major issues discussed in [T^] are 
embeddability, uniqueness of the embedding and the effects of data/sampling 
error. All of these are also considered here. We call a Markov matrix A 
embeddable if there exists a matrix B such that $A = e^{B}$ and $e^{Bt}$ is Markov 
for all $t \ge 0$. The matrix B involved need not be unique, but it must have 
non-negative off-diagonal entries and all its row sums must vanish; see [Tj or 
[SI section 12.3]. In probabilistic terms a Markov matrix A is embeddable if it 
is obtained by taking a snapshot at a particular time of an autonomous finite 
state Markov process that develops continuously in time. On the other hand 
a Markov matrix might not be embeddable if it describes the annual changes 
in a population that has a strongly seasonal breeding pattern; in such cases 
one might construct a more elaborate model that incorporates the seasonal 
variations. Embeddability may also fail because the matrix entries are not 
accurate; in such cases a regularization technique might yield a very similar 
Markov matrix that is embeddable; see [15] for examples arising in finance. 

Theorem 9 describes some spectral consequences of embeddability. The earliest 
analysis of the structure of the set $\mathcal{E}$ of embeddable n x n Markov matrices 
and its topological boundary in the set of all Markov matrices was given by 
Kingman [T3], who concluded that except in the case n = 2 it seemed unlikely 
that any very explicit characterisation of $\mathcal{E}$ could be given; see [12] for further 
work on this problem. Theorem 13 proves that a well-known method of 
approximating a Markov matrix by an embeddable Markov matrix is optimal in 
a certain sense. Many of the results in the present paper appear in one form 
or other in papers devoted to the wide variety of applications, and it is hoped 
that collecting them in one place may be of value. 

2 The main theorem 

For the sake of definiteness we define the principal logarithm of a number 
$z \in \mathbb{C} \setminus (-\infty, 0]$ to be the branch of the logarithm with values in 
$\{w : |\mathrm{Im}(w)| < \pi\}$. We define the principal logarithm of an n x n matrix A 
such that $\mathrm{Spec}(A) \cap (-\infty, 0] = \emptyset$ to be that defined by the functional 
integral 

$$\log(A) = \frac{1}{2\pi i} \int_{\gamma} \log(z) \, (zI - A)^{-1} \, dz \qquad (1)$$

using the principal logarithm of z and a simple closed contour $\gamma$ in 
$\mathbb{C} \setminus (-\infty, 0]$ that encloses the spectrum of A. This formula goes back to 
Giorgi in 1926; see [6, Theorem VII.1.10 and notes, p. 607]. If $A = TDT^{-1}$ 
where D is diagonal, this is equivalent to $\log(A) = T \log(D) T^{-1}$ where 
$\log(D)$ is obtained from D by applying log to each diagonal entry of D. The 
non-diagonalisable case is discussed in some detail in [TP] and yields the 
same matrix as (1). 

Lemma 1 If A is a Markov matrix and $\mathrm{Spec}(A) \cap (-\infty, 0] = \emptyset$ then the 
principal logarithm $L = \log(A)$ lies in the set $\mathcal{L}$ of all real n x n matrices L 
such that $\sum_{1 \le j \le n} L_{ij} = 0$ for every i. 

Proof We use the formula (1) and take the contour $\gamma$ to be symmetrical with 
respect to reflection about the x-axis. The statements of the lemma follow 
directly from two properties of the resolvent matrices. 

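The first is the identity 

$$\overline{\left\{ (zI - A)^{-1} \right\}_{ij}} = \left\{ (\bar{z} I - A)^{-1} \right\}_{ij}. \qquad (2)$$

This holds for large |z| by virtue of the identity 

$$(zI - A)^{-1} = z^{-1} \sum_{n=0}^{\infty} (A/z)^{n} \qquad (3)$$

and (2) then extends to all $z \notin \mathrm{Spec}(A)$ by analytic continuation. 
The second identity needed is 

$$(zI - A)^{-1} 1 = (z - 1)^{-1} 1,$$

where 1 denotes the column vector all of whose entries equal 1; its proof 
follows the same route, using (3) and analytic continuation. ■ 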

The results in our next lemma are all well known and are included for com- 
pleteness. 

Lemma 2 If A is embeddable then 0 is not an eigenvalue of A and every 
negative eigenvalue has even algebraic multiplicity. Moreover det(A) > 0. If A 
is embeddable and $A_{ij} > 0$, $A_{jk} > 0$ then $A_{ik} > 0$. 

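Proof The first statement follows from the fact that 

$$\mathrm{Spec}(A) = \exp(\mathrm{Spec}(B)).$$

Given an eigenvalue $\lambda < 0$ of A let 

$$S_{+} = \{ z \in \mathrm{Spec}(B) : e^{z} = \lambda \text{ and } \mathrm{Im}(z) > 0 \},$$
$$S_{-} = \{ z \in \mathrm{Spec}(B) : e^{z} = \lambda \text{ and } \mathrm{Im}(z) < 0 \},$$

and let $\mathcal{L}_{\pm}$ be (the ranges of) the spectral projections of B associated 
with $S_{\pm}$. Since $e^{z} = \lambda$ implies that $\mathrm{Im}(z) \ne 0$, we can deduce that 
$\mathcal{L}_{-} \cap \mathcal{L}_{+} = 0$ and that $\mathcal{M} = \mathcal{L}_{+} + \mathcal{L}_{-}$ is the spectral 
projection of A associated with the eigenvalue $\lambda$. Since B is real, 
$\mathcal{L}_{-}$ may be obtained from $\mathcal{L}_{+}$ by complex conjugation, so 

$$\dim(\mathcal{M}) = \dim(\mathcal{L}_{+}) + \dim(\mathcal{L}_{-}) = 2 \dim(\mathcal{L}_{+}).$$

See [7]. 

By combining the reality of B with the formula $\det(A) = e^{\mathrm{tr}(B)}$ we obtain 
det(A) > 0. See [H]. 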
The last statement follows from the general theory of Markov chains and is 
due to Ornstein and Levy, independently; see [U Section 2.5, Theorem 2] and 
O Theorem 13.2.4]. We first note that one may write $B = C - \delta I$ where all 
the entries of C are non-negative and $\delta \ge 0$. Hence 

$$e^{Bt} = e^{-\delta t} \sum_{n=0}^{\infty} \frac{t^{n} C^{n}}{n!}$$

where each entry of each $C^{n}$ is non-negative. This implies that if 
$(e^{Bt})_{ij} > 0$ for some t > 0 the same holds for all t > 0. This quickly 
yields the final statement. ■ 

Kingman [H] has shown that the set $\mathcal{E}$ of embeddable Markov matrices is a 
closed subset of the set of all n x n Markov matrices. The matrix norm used 
throughout this paper is 

$$\|M\| = \max\{ \|Mv\|_{\infty} : \|v\|_{\infty} \le 1 \} = \max_{1 \le i \le n} \left\{ \sum_{j=1}^{n} |M_{ij}| \right\}. \qquad (4)$$



Lemma 3 The set S of all $A \in \mathcal{E}$ with no negative eigenvalues is a dense 
relatively open subset of $\mathcal{E}$. 

Proof If $A \in S$ then a simple perturbation theoretic argument implies that 
there exists $\varepsilon > 0$ such that C has no negative eigenvalues for any n x n 
matrix C satisfying $\|A - C\| < \varepsilon$. This implies that S is relatively open 
in $\mathcal{E}$. 

If $A \in \mathcal{E}$ then $A = e^{B}$ for some Markov generator B. If $\{x_{r} + i y_{r}\}_{r=1}^{n}$ 
is the set of eigenvalues of B and t > 0 then $e^{Bt}$ has a negative eigenvalue 
if and only if $t y_{r} = \pi(2m + 1)$ for some r and some integer m. The set of 
such t is clearly discrete. It follows that $e^{Bt} \in S$ for all t close enough 
to 1 except possibly for 1 itself. Since $\lim_{t \to 1} \|e^{Bt} - A\| = 0$, we 
conclude that S is dense in $\mathcal{E}$. 

■ 

The following example shows that the density property in Lemma 3 depends 
on the embeddability hypothesis. 

Example 4 

The Markov matrix 

$$A = \begin{pmatrix} 1/3 & 2/3 \\ 2/3 & 1/3 \end{pmatrix}$$

has Spec(A) = {1, -1/3}. If $0 < \varepsilon < 1/3$, any matrix close enough to A also 
has a single eigenvalue $\lambda$ satisfying $|\lambda + 1/3| < \varepsilon$ by a standard 
perturbation theoretic argument. If the perturbed matrix has real entries then 
the complex conjugate of $\lambda$ is also an eigenvalue, so $\lambda$ must be real and 
negative. Therefore the set of Markov matrices with no negative eigenvalues 
is relatively open but not dense in the set of all Markov matrices, at least 
for n = 2. The example may be used to construct a similar example for every 
n > 2. ■ 

We will need Lemma 5 and its corollary in the proof of Theorem 7. 

Lemma 5 There exists a polynomial p in the coefficients of an n x n matrix 
A such that A has a multiple eigenvalue in the algebraic sense if and only if 
p = 0. We call p the discriminant of A. 

Proof A has a multiple eigenvalue if and only if its characteristic polynomial 
$q(z) = z^{n} + a_{1} z^{n-1} + \ldots + a_{n}$ has a multiple root; the coefficients of q 
are themselves polynomials in the entries of A. Moreover q has a multiple root 
if and only if its discriminant (the square of its Vandermonde determinant) 
vanishes, and the discriminant of q is a polynomial in $a_{1}, \ldots, a_{n}$. ■ 

Corollary 6 If $A_{0}$, $A_{1}$ are two n x n matrices then either 
$A_{z} = (1 - z) A_{0} + z A_{1}$ has a multiple eigenvalue for all $z \in \mathbb{C}$ or this 
happens only for a finite number of z. 

Proof The discriminant of $A_{z}$ is a polynomial in z, which has a finite number 
of roots unless it vanishes identically. ■ 

Theorem 7 The set $\mathcal{T}$ of all n x n embeddable Markov matrices that have n 
distinct eigenvalues is relatively open and dense in the set $\mathcal{E}$ of all 
embeddable Markov matrices. 

Proof A standard argument from perturbation theory establishes that $\mathcal{T}$ is 
relatively open in $\mathcal{E}$, so we only need to prove its density. 

Let $A = e^{B_{0}}$ where $B_{0}$ is a Markov generator, and let $\varepsilon > 0$. Then put 
$B_{t} = (1 - t) B_{0} + t B_{1}$ where 

$$(B_{1})_{r,s} = \begin{cases} -1 & \text{if } r = s, \\ 1 & \text{if } r + 1 = s, \\ 1 & \text{if } r = n, \ s = 1, \\ 0 & \text{otherwise.} \end{cases}$$

One sees immediately that $B_{t}$ is a Markov generator for all $t \in [0, 1]$ and 
that it has n distinct eigenvalues if t = 1. Corollary 6 now implies that the 
eigenvalues of $B_{t}$ are distinct for all sufficiently small t > 0. By further 
restricting the size of t > 0 we may also ensure that $\|e^{B_{t}} - A\| < \varepsilon/2$. 

Having chosen t, we put $B = s B_{t}$ where $s \in \mathbb{R}$ is close enough to 1 so that 
$\|e^{s B_{t}} - e^{B_{t}}\| < \varepsilon/2$; we also choose s so that if $\lambda_{1}, \lambda_{2}$ are any 
two eigenvalues of $B_{t}$ then $s(\lambda_{1} - \lambda_{2}) \notin 2\pi i \mathbb{Z}$. These conditions 
ensure that $\|e^{B} - A\| < \varepsilon$ and that $e^{B}$ has n distinct eigenvalues. ■ 

The following lemma may be contrasted with the fact that a complex number 
$\lambda$ such that $|\lambda| = 1$ is the eigenvalue of some n x n Markov matrix if and 
only if $\lambda^{r} = 1$ for some $r \in \{1, 2, \ldots, n\}$; see [16, Chap. 7, Theorem 1.4]. 
Permutation matrices provide examples of such spectral behaviour. The lemma 
has been extended to an infinite-dimensional context in [1]. 

Lemma 8 (Elfving, [7]) If A is an embeddable Markov matrix and $\lambda \ne 1$ is 
an eigenvalue of A then $|\lambda| < 1$. 

Proof Our hypotheses imply by [3 Lemma 12.3.5] that Spec(A) = exp(Spec(B)) 
where B = c(C - I), c > 0 and C is a Markov matrix. Since C is a contraction 
when considered as acting in $\mathbb{C}^{n}$ with the $l^{\infty}$ norm, 
$\mathrm{Spec}(C) \subseteq \{z : |z| \le 1\}$. Therefore every eigenvalue $\lambda$ of B except 0 
satisfies $\mathrm{Re}(\lambda) < 0$. The lemma follows. ■ 

The main application of the following theorem may be to establish that certain 
Markov matrices arising in applications are not embeddable, and hence either 
that the entries are not numerically accurate or that the underlying process is 
not autonomous. The theorem is a quantitative strengthening of Lemma 8. It 
is of limited value except when n is fairly small, but this is often the case in 
applications. 

Theorem 9 (Runnenberg, [18], [19]) If $n \ge 3$ and the n x n Markov matrix 
A is embeddable then its spectrum is contained in the set 

$$\{ r e^{i\theta} : -\pi \le \theta \le \pi, \ 0 < r \le r(\theta) \}$$

where 

$$r(\theta) = \exp(-|\theta| \tan(\pi/n)).$$

Proof This depends on two facts, firstly that Spec(A) = exp(Spec(B)) where 
B = c(C - I), c > 0 and C is a Markov matrix. Secondly 

$$\mathrm{Spec}(C - I) \subseteq \{ z : |\arg(z)| \ge \pi/2 + \pi/n \} \qquad (5)$$
$$= \{ -u + iv : u \ge 0, \ |v| \le u \cot(\pi/n) \} \qquad (6)$$

by applying a theorem of Karpelevic to C and then deducing 
(6) from that; see [13] or [16, Chap. 7, Theorem 1.8]. The relevant boundary 
curve (actually a straight line segment from 1 to $e^{2\pi i/n}$) is the case 
q = 0, p = 1 and r = n of $\lambda^{q} (\lambda^{p} - t)^{r} = (1 - t)^{r}$, where $0 \le t \le 1$. 
The small part of the theorem of Karpelevic that we need was proved by 
Dmitriev and Dynkin; see 
[ISl Chap. 8, Theorem 1.7]. ■ 
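
As a rough illustration of how Theorem 9 might be used to rule out embeddability, the following sketch tests whether a list of eigenvalues lies in the region $0 < r \le \exp(-|\theta| \tan(\pi/n))$. The function names and the tolerance are our own choices and not part of the original text; the eigenvalues are assumed to have been computed elsewhere.

```python
import cmath
import math

def runnenberg_bound(theta: float, n: int) -> float:
    """r(theta) = exp(-|theta| tan(pi/n)) for -pi <= theta <= pi, n >= 3."""
    return math.exp(-abs(theta) * math.tan(math.pi / n))

def violates_runnenberg(eigenvalues, n: int, tol: float = 1e-12):
    """Return the eigenvalues that lie outside the region of Theorem 9.

    A zero eigenvalue is also reported, since the region requires r > 0.
    """
    bad = []
    for lam in eigenvalues:
        r, theta = cmath.polar(lam)  # theta lies in [-pi, pi]
        if r <= tol or r > runnenberg_bound(theta, n) + tol:
            bad.append(lam)
    return bad

# Eigenvalues of the 3 x 3 cyclic permutation matrix: the cube roots of unity.
roots = [cmath.exp(2j * math.pi * k / 3) for k in range(3)]
flagged = violates_runnenberg(roots, 3)
print(len(flagged))  # the two non-real roots fail the test

# A real spectrum inside the unit interval passes the test.
print(violates_runnenberg([1.0, 0.32, 0.16], 3))
```

Passing this test is only a necessary condition; a matrix whose spectrum lies in the region need not be embeddable.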
We turn now to the question of uniqueness. The first example of a Markov 
matrix A that can be written in the form $A = e^{B}$ for two different Markov 
generators was given by Speakman in [20]; Example 17 provides another. The 
initial hypothesis of our next result holds for most embeddable Markov matrices 
by Theorem 7. 

Corollary 10 Let A be an invertible n x n Markov matrix with distinct 
eigenvalues $\lambda_{1}, \ldots, \lambda_{n}$. 

1. The solutions of $e^{B} = A$ form a discrete set and they all commute with 
each other and with A. 

2. Only a finite number of the solutions of $e^{B} = A$ can be Markov generators. 

3. If 

$$|\lambda_{r}| > \exp(-\pi \tan(\pi/n)) \qquad (7)$$

for all r then only one of the solutions of $e^{B} = A$ can be a Markov 
generator, namely the principal logarithm. 

Proof 

1. Since Spec(A) = exp(Spec(B)), each B must have n distinct eigenvalues 
$\mu_{1}, \ldots, \mu_{n}$ and the corresponding eigenvectors form a basis in $\mathbb{C}^{n}$. These 
eigenvectors are also eigenvectors for A and 

$$\lambda_{r} = e^{\mu_{r}} \qquad (8)$$

for all r. It follows that B can be written as a polynomial function of A. 
For each $\lambda_{r}$, the equation (8) has a discrete set of solutions $\mu_{r}$. 

2. If A is an invertible Markov matrix with distinct eigenvalues and the 
solution B of $e^{B} = A$ is a Markov generator then every eigenvalue $\mu_{r}$ of 
B lies in the sector $\{-u + iv : u \ge 0, \ |v| \le u \cot(\pi/n)\}$ by (6). Combining 
this restriction on the imaginary parts of the eigenvalues with (8) reduces 
the set of such B to a finite number. See pT] Theorem 6.1] for items 1 
and 2, and for an algorithm implementing item 2. 

3. We continue with the assumptions and notation of item 2. The assumption 
(7) implies that if $\mu_{r} = -u_{r} + i v_{r}$ then $u_{r} < \pi \tan(\pi/n)$. Item 2 now 
yields $|v_{r}| < \pi$. Hence $\mu_{r}$ is the principal logarithm of $\lambda_{r}$ and B is the 
principal logarithm of A. 
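
The counting argument in item 2 can be made concrete for a single eigenvalue. The sketch below lists every $\mu$ with $e^{\mu} = \lambda$ that lies in the sector $\{-u + iv : u \ge 0, |v| \le u \cot(\pi/n)\}$. It is only an illustration under our own naming (`candidate_log_eigenvalues` is hypothetical) and is not the algorithm of the reference cited above.

```python
import cmath
import math

def candidate_log_eigenvalues(lam: complex, n: int, tol: float = 1e-12):
    """All mu with exp(mu) = lam and mu = -u + iv, u >= 0, |v| <= u cot(pi/n)."""
    principal = cmath.log(lam)        # imaginary part in [-pi, pi]
    u = -principal.real
    if u < 0:
        return []                     # |lam| > 1 cannot occur for a generator
    v_max = u / math.tan(math.pi / n)  # u * cot(pi/n)
    k_max = int(v_max / (2 * math.pi)) + 1
    found = []
    for k in range(-k_max, k_max + 1):
        v = principal.imag + 2 * math.pi * k
        if abs(v) <= v_max + tol:
            found.append(complex(-u, v))
    return found

# |lam| = 0.32 exceeds exp(-pi tan(pi/3)), so only the principal branch survives.
print(candidate_log_eigenvalues(0.32, 3))

# A much smaller eigenvalue admits two admissible branches when n = 3.
lam = cmath.exp(complex(-6.0, 3.4641))
print(candidate_log_eigenvalues(lam, 3))
```

The second call uses the eigenvalue that appears in Example 17, where two distinct Markov generators are indeed found.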
The conclusions of the above corollary do not hold if A has repeated eigenvalues 
or a non-trivial Jordan form; see [21 [II]- For example the n x n identity matrix 
has a continuum of distinct logarithms B which do not all commute; if the 
eigenvalues of B are chosen to be $\{2\pi r i : 1 \le r \le n\}$, then the possible B are 
parametrized by the choice of an arbitrary basis as its set of eigenvectors. The 
general classification of logarithms is given in [8] and [9, Theorem 1.28]. These 
comments reveal a numerical instability in the logarithm of a matrix if it has 
two or more eigenvalues that are very close to each other. 

The following provides a few other conditions that imply the uniqueness of a 
Markov generator B such that $A = e^{B}$. 

Theorem 11 (Cuthbert, [2], [3]) Let $A = e^{B}$ where B is a Markov generator. 
Then (9) ⇒ (10) ⇒ (11) ⇒ (12) where 

$$e^{-\pi} < \det(A) \le 1, \qquad (9)$$
$$-\pi < \mathrm{tr}(B) \le 0, \qquad (10)$$
$$\|B + \beta I\| < \pi, \text{ where } \beta = \max_{1 \le i \le n} \{|B_{ii}|\}, \qquad (11)$$
$$\mathrm{Spec}(B) \subseteq \{z : |\mathrm{Im}(z)| < \pi\}. \qquad (12)$$

If A is a Markov matrix that has distinct eigenvalues and $\det(A) > e^{-\pi}$ then 
its only possible Markov generator is its principal logarithm log(A). 

Proof 

(9) ⇒ (10) This uses $\det(A) = e^{\mathrm{tr}(B)}$. 

(10) ⇒ (11) This uses the fact that $B + \beta I$ has non-negative entries and its 
row sums all equal $\beta$, which satisfies $\beta < \pi$. 

(11) ⇒ (12) follows directly from $\mathrm{Spec}(B + \beta I) \subseteq \{z : |z| < \pi\}$. 

The final statement of the theorem follows the proof of Corollary 10. ■ 
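
The determinant criterion is easy to test numerically. As a minimal sketch, we apply it to the 3 x 3 Markov matrix of Example 14 in Section 4, whose eigenvalues 1, 0.32, 0.16 are distinct; the helper `det3` is our own and is written out only to keep the example self-contained.

```python
import math

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# The Markov matrix of Example 14.
A = [[0.30, 0.45, 0.25],
     [0.14, 0.84, 0.02],
     [0.14, 0.52, 0.34]]

threshold = math.exp(-math.pi)       # approximately 0.0432
print(det3(A), threshold)
print(det3(A) > threshold)
```

Since det(A) = 1 x 0.32 x 0.16 = 0.0512 exceeds $e^{-\pi}$, the final statement of Theorem 11 says that the only possible Markov generator of this A is log(A). Example 14 reports that log(A) has a negative off-diagonal entry, so this A should not be embeddable.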

If $A_{t}$ is a one-parameter Markov semigroup then for every t > 0 one may 
define L(t) to be the number of Markov generators B such that $e^{Bt} = A_{t}$. 
Some general theorems concerning the dependence of L{t) on t may be found 
in ^m\. 



3 Regularization 

Let $\mathcal{G}$ denote the set of n x n Markov generators; following the notation of 
Lemma 1, $\mathcal{G}$ is the set of $G \in \mathcal{L}$ such that $G_{ij} \ge 0$ whenever $i \ne j$. 

Let A be a Markov matrix satisfying the assumptions of Lemma 1 for which 
L = log(A) does not lie in $\mathcal{G}$. There are several regularizations of L, that is 
algorithms that replace L by some $G \in \mathcal{G}$ that is (nearly) as close to L as 
possible. Kreinin and Sidelnikova [15] have compared different regularization 
algorithms for several empirical examples arising in finance and it appears that 
they all have similar accuracy. The best approximation must surely depend on 
the matrix norm used, but if one considers the physically relevant matrix norm 
(4) then we prove that the simplest method, called diagonal adjustment in [TH], 
also produces the best possible approximation. We emphasize that although 
$\mathcal{G}$ is a closed convex cone, this does not imply that the best approximation is 
unique, because the matrix norm (4) is not strictly convex. 

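Theorem 12 Let $L \in \mathcal{L}$ and define $B \in \mathcal{G}$ by 

$$B_{ij} = \begin{cases} L_{ij} & \text{if } i \ne j \text{ and } L_{ij} \ge 0, \\ 0 & \text{if } i \ne j \text{ and } L_{ij} < 0, \end{cases}$$

together with the constraint $\sum_{j=1}^{n} B_{ij} = 0$ for all i. Then 

$$\|L - B\| = \min\{ \|L - G\| : G \in \mathcal{G} \}.$$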

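Proof It follows from the definition of the matrix norm that we can deal with 
the matrix rows one at a time. We therefore fix i and put 

$$\ell_{j} = L_{ij},$$
$$P = \{ j : j \ne i \text{ and } \ell_{j} \ge 0 \},$$
$$N = \{ j : j \ne i \text{ and } \ell_{j} < 0 \},$$
$$\ell_{P} = \sum_{j \in P} \ell_{j}, \qquad \ell_{N} = \sum_{j \in N} |\ell_{j}|,$$

so that $\ell_{i} = \ell_{N} - \ell_{P}$. We next put $b_{j} = B_{ij}$, where B is defined as in the 
statement of the theorem. Thus 

$$b_{j} = \begin{cases} \ell_{j} & \text{if } j \in P, \\ 0 & \text{if } j \in N, \\ -\ell_{P} & \text{if } j = i. \end{cases}$$

A direct calculation shows that 

$$\ell_{j} - b_{j} = \begin{cases} 0 & \text{if } j \in P, \\ \ell_{j} & \text{if } j \in N, \\ \ell_{N} & \text{if } j = i. \end{cases}$$

Therefore 

$$\|\ell - b\|_{1} = 2 \ell_{N}.$$

Finally given $G \in \mathcal{G}$ we define $g_{j} = G_{ij}$ for all j. Since $g_{j} \ge 0$ for 
$j \ne i$ and $g_{i} = -\sum_{j \ne i} g_{j}$, we have 

$$\|\ell - g\|_{1} = \sum_{j \in P} |\ell_{j} - g_{j}| + \sum_{j \in N} (|\ell_{j}| + g_{j}) + |\ell_{i} - g_{i}|$$
$$\ge \sum_{j \in P} |\ell_{j} - g_{j}| + \ell_{N} + \Big( \ell_{N} + \sum_{j \in N} g_{j} - \sum_{j \in P} |\ell_{j} - g_{j}| \Big)$$
$$\ge 2 \ell_{N}.$$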
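
The construction in Theorem 12 is short enough to write out directly. The following is a minimal sketch of diagonal adjustment together with the row-sum norm (4); the input matrix is a made-up element of $\mathcal{L}$ chosen only to exercise the code, and the function names are ours.

```python
from typing import List

Matrix = List[List[float]]

def diagonal_adjustment(L: Matrix) -> Matrix:
    """Zero the negative off-diagonal entries of L, then reset each diagonal
    entry so that every row of the result sums to zero."""
    n = len(L)
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j != i and L[i][j] >= 0:
                B[i][j] = L[i][j]
        B[i][i] = -sum(B[i][j] for j in range(n) if j != i)
    return B

def row_sum_norm(M: Matrix) -> float:
    """The matrix norm (4): the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in M)

def subtract(X: Matrix, Y: Matrix) -> Matrix:
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Hypothetical input: real entries, every row sums to zero.
L = [[-1.0, 1.2, -0.2],
     [0.5, -0.5, 0.0],
     [-0.1, -0.3, 0.4]]

B = diagonal_adjustment(L)
distance = row_sum_norm(subtract(L, B))
print(B)
print(distance)  # twice the largest sum of negative off-diagonal magnitudes in a row
```

In this example the third row has $\ell_{N} = 0.4$, so the distance is 0.8, in agreement with $\|\ell - b\|_{1} = 2\ell_{N}$ taken row by row.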



Theorem 13 Let A be a Markov matrix such that $\mathrm{Spec}(A) \cap (-\infty, 0] = \emptyset$ and 
put L = log(A). If $B \in \mathcal{G}$ and $\|L - B\| = \varepsilon$ then 

$$\|A - e^{B}\| \le \min\{2, e^{\varepsilon} - 1\} \le \min\{2, 2\varepsilon\}.$$

Proof If we put E = L - B then the series expansion 

$$e^{L} = e^{B} + \int_{t=0}^{1} e^{B(1-t)} E e^{Bt} \, dt + \int_{t=0}^{1} \int_{s=0}^{t} e^{B(1-t)} E e^{B(t-s)} E e^{Bs} \, ds \, dt + \ldots$$

given in [F, Theorem 11.4.1] yields 

$$\|A - e^{B}\| = \|e^{L} - e^{B}\| \le \|E\| + \|E\|^{2}/2! + \ldots = e^{\|E\|} - 1.$$

The other part of the estimate uses $\|A\| = 1$ and $\|e^{B}\| = 1$. 



4 Some Numerical Examples 



Example 14 

The Markov matrix 

$$A = \begin{pmatrix} 0.30 & 0.45 & 0.25 \\ 0.14 & 0.84 & 0.02 \\ 0.14 & 0.52 & 0.34 \end{pmatrix}$$

has eigenvalues 1, 0.32, 0.16, exactly. The matrix L = log(A) is given to four 
decimal places by 

$$L = \begin{pmatrix} -1.5272 & 0.5991 & 0.9281 \\ 0.3054 & -0.2371 & -0.0683 \\ 0.3054 & 0.9023 & -1.2078 \end{pmatrix}$$

and has a negative off-diagonal entry. The closest Markov generator B to L as 
described above is 

$$B = \begin{pmatrix} -1.5272 & 0.5991 & 0.9281 \\ 0.3054 & -0.3054 & 0 \\ 0.3054 & 0.9023 & -1.2078 \end{pmatrix}$$

and the embeddable Markov matrix $\tilde{A} = e^{B}$ (where B is entered to full 
precision) is given by 

$$\tilde{A} = \begin{pmatrix} 0.3000 & 0.4383 & 0.2617 \\ 0.1400 & 0.8046 & 0.0554 \\ 0.1400 & 0.5057 & 0.3543 \end{pmatrix}.$$

One observes that all the entries of $A - \tilde{A}$ are less than 0.036 in absolute value. 
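
The numbers in Example 14 can be checked with a few lines of code. The sketch below is not how the values above were obtained; it is a pure-Python illustration under two assumptions of ours: the logarithm is computed from the series $\log(A) = -\sum_{k \ge 1} (I - A)^{k}/k$, which converges here because the eigenvalues of I - A are 0, 0.68 and 0.84, and the exponential is computed by scaling and squaring with a truncated Taylor series.

```python
import math

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]

def subtract(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def expm(M, squarings=12, terms=18):
    """Matrix exponential: Taylor series of M / 2^squarings, then repeated squaring."""
    n = len(M)
    S = [[x / 2.0 ** squarings for x in row] for row in M]
    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, S)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def logm_series(A, terms=400):
    """log(A) = -sum_k (I - A)^k / k; valid only if the series converges."""
    n = len(A)
    X = subtract(identity(n), A)
    power, total = identity(n), [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        power = matmul(power, X)
        total = [[total[i][j] - power[i][j] / k for j in range(n)] for i in range(n)]
    return total

def diagonal_adjustment(L):
    n = len(L)
    B = [[L[i][j] if (i != j and L[i][j] >= 0) else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        B[i][i] = -sum(B[i][j] for j in range(n) if j != i)
    return B

def row_sum_norm(M):
    return max(sum(abs(x) for x in row) for row in M)

A = [[0.30, 0.45, 0.25], [0.14, 0.84, 0.02], [0.14, 0.52, 0.34]]
L = logm_series(A)
B = diagonal_adjustment(L)
A_tilde = expm(B)
eps = row_sum_norm(subtract(L, B))          # the epsilon of Theorem 13
err = row_sum_norm(subtract(A, A_tilde))    # the norm of A - exp(B)
print(eps, err, math.exp(eps) - 1)
```

The printed values allow a direct comparison of the actual error with the bound $e^{\varepsilon} - 1$ of Theorem 13.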



The following exactly soluble example illustrates the use of some of our 
theorems. 

Theorem 15 Let 

$$L_{s} = \begin{pmatrix} -1-s & 1 & s \\ s & -1-s & 1 \\ 1 & s & -1-s \end{pmatrix}$$

where $s \in \mathbb{R}$, and let $A_{s} = e^{L_{s}}$. Then 

1. If $s \ge 0$ then $A_{s}$ is an embeddable Markov matrix. 

2. If $s < \sigma \approx -0.5712$ then $A_{s}$ has at least one negative entry. 

3. If $\sigma \le s < 0$ then $A_{s}$ is Markov but not embeddable. 

Proof We first note that $L_{s} 1 = 0$ for every $s \in \mathbb{R}$, so $A_{s} 1 = 1$. 

Item 1 follows from the fact that $L_{s}$ satisfies all the conditions for the 
generator of a Markov semigroup. 

To prove item 2 we note that $L_{s} = -(1 + s) I + F + s B$ where F, B are 
permutation matrices that commute. Let S be the set of all s such that 
$e^{L_{s}}$ is non-negative. If $t \in S$ and $s > t$ then 

$$L_{s} = L_{t} + (s - t) B - (s - t) I$$

where all the matrices commute, and 

$$e^{L_{s}} = e^{-(s-t)} e^{L_{t}} e^{B(s-t)} \ge 0,$$

so $s \in S$. Therefore S is an interval, which is obviously closed. Numerical 
calculations show that the smallest number $\sigma \in S$ is approximately -0.5712. 
More rigorously if s < -1 then 

$$\det(A_{s}) = e^{\mathrm{tr}(L_{s})} = e^{-3(1+s)} > 1$$

so $A_{s}$ cannot be a Markov matrix and must have a negative entry. This 
establishes that $\sigma \ge -1$. 

Clearly $L_{s}$ is not a Markov generator if $\sigma \le s < 0$. We prove item 3 by 
obtaining a contradiction from the existence of a Markov generator $B_{s}$ with 
$e^{B_{s}} = A_{s}$. Since 

$$\exp(\mathrm{Spec}(L_{s})) = \mathrm{Spec}(A_{s}) = \exp(\mathrm{Spec}(B_{s}))$$

we conclude that every eigenvalue of $L_{s}$ differs from an eigenvalue of $B_{s}$ by 
an integral multiple of $2\pi i$. A direct computation shows that 

$$\mathrm{Spec}(L_{s}) = \left\{ 0, \ -\frac{3(1+s)}{2} \pm \frac{\sqrt{3}(1-s)}{2} i \right\}.$$

For s in the stated range, each non-zero eigenvalue $\lambda$ of $L_{s}$ satisfies 
$|\arg(\lambda)| < 5\pi/6$ and the same applies if one adds an integral multiple of 
$2\pi i$ to the eigenvalue. Hence each non-zero eigenvalue $\lambda$ of $B_{s}$ satisfies 
$|\arg(\lambda)| < 5\pi/6$ and (5) implies that $B_{s}$ cannot be a generator. ■ 
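
The threshold $\sigma$ in item 2 can be located numerically. The sketch below is one possible way to do so and is not taken from the paper: it uses bisection on [-1, 0], relying on the fact shown in the proof that the set of s with $e^{L_{s}} \ge 0$ is an interval, and it computes the exponential by scaling and squaring with a truncated Taylor series (our own helper `expm`).

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(M, squarings=12, terms=18):
    """Matrix exponential: Taylor series of M / 2^squarings, then repeated squaring."""
    n = len(M)
    S = [[x / 2.0 ** squarings for x in row] for row in M]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, S)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def generator(s):
    """The matrix L_s of Theorem 15."""
    return [[-1 - s, 1, s],
            [s, -1 - s, 1],
            [1, s, -1 - s]]

def min_entry(s):
    return min(min(row) for row in expm(generator(s)))

# exp(L_s) has a negative entry at s = -1 and is non-negative at s = 0.
lo, hi = -1.0, 0.0
for _ in range(40):
    mid = (lo + hi) / 2
    if min_entry(mid) >= 0:
        hi = mid
    else:
        lo = mid
sigma = hi
print(sigma)
```

Bisection is justified only because of the interval property; without it one would have to scan the whole range of s.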

Example 16 

The following illustrates the difficulties in dealing with Markov matrices that 
have negative eigenvalues. If $c = 2\pi/\sqrt{3}$ and 

$$B = c \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 1 & 0 & -1 \end{pmatrix}$$

then the eigenvalues of B are $0, -\sqrt{3}\pi \pm \pi i$. The matrix $A = e^{B}$ is 
self-adjoint with eigenvalues $1, -e^{-\sqrt{3}\pi}, -e^{-\sqrt{3}\pi}$. If one uses Matlab's 
'logm' command to compute log(A), one obtains a matrix with complex entries 
that is not close to B; it might be considered that 'logm' should produce a 
real logarithm of a real matrix if one exists, but it is not easy to see how 
to achieve this. 

Example 17 

We continue with the above example, but with the more typical choice c = 4. 
The eigenvalues of B are now $0, -6 \pm 3.4641i$. Clearly $A = e^{B}$ is an embeddable 
Markov matrix. If one rounds to four digits one obtains 

$$A = \begin{pmatrix} 0.3318 & 0.3337 & 0.3346 \\ 0.3346 & 0.3318 & 0.3337 \\ 0.3337 & 0.3346 & 0.3318 \end{pmatrix}.$$

The use of 'logm' yields 

$$\log(A) = \begin{pmatrix} -4 & 0.3724 & 3.6276 \\ 3.6276 & -4 & 0.3724 \\ 0.3724 & 3.6276 & -4 \end{pmatrix}$$

which is also a Markov generator. We conclude that A is an embeddable 
Markov matrix in (at least) two distinct ways. 

This is not an instance of a general phenomenon. If one defines the 5 x 5 cyclic 
matrix B by 

$$B_{r,s} = \begin{cases} -4 & \text{if } r = s, \\ 4 & \text{if } r + 1 = s, \\ 4 & \text{if } r = 5, \ s = 1, \\ 0 & \text{otherwise,} \end{cases}$$

then B is a Markov generator with eigenvalues $0, -7.2361 \pm 2.3511i, -2.7639 \pm 3.8042i$. 
However L = log(exp(B)), with the principal choice of the matrix 
logarithm, is a cyclic matrix with some negative off-diagonal entries, so it 
cannot be a Markov generator. ■ 
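
The non-uniqueness in the 3 x 3 case of Example 17 can be confirmed without any matrix logarithm routine: it suffices to exponentiate the two generators and compare. The sketch below does this in pure Python; the exponential helper is our own scaling-and-squaring implementation, and the second generator is entered with the four-digit values quoted above, so agreement is expected only to about that accuracy.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(M, squarings=12, terms=18):
    """Matrix exponential: Taylor series of M / 2^squarings, then repeated squaring."""
    n = len(M)
    S = [[x / 2.0 ** squarings for x in row] for row in M]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, S)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

# The cyclic generator with c = 4.
B_cyclic = [[-4.0, 4.0, 0.0],
            [0.0, -4.0, 4.0],
            [4.0, 0.0, -4.0]]

# The principal logarithm reported in Example 17 (four-digit rounding).
B_logm = [[-4.0, 0.3724, 3.6276],
          [3.6276, -4.0, 0.3724],
          [0.3724, 3.6276, -4.0]]

A1 = expm(B_cyclic)
A2 = expm(B_logm)
max_diff = max(abs(A1[i][j] - A2[i][j]) for i in range(3) for j in range(3))
print([round(x, 4) for x in A1[0]])
print(max_diff)
```

Both matrices have non-negative off-diagonal entries and zero row sums, so each is a Markov generator, and the two exponentials agree up to the rounding of the second generator.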